Multilayer Hadamard Decomposition of the Discrete Hartley Transform

H. M. de Oliveira* R. J. Cintra^ R. M. Campello de Souza^ 


Abstract 

Discrete transforms such as the discrete Fourier transform (DFT) or the discrete Hartley transform (DHT) furnish 
an indispensable tool in signal processing. The successful application of transform techniques relies on the existence 
of the so-called fast transforms. In this paper some fast algorithms are derived which meet the lower bound on 
the multiplicative complexity of the DFT/DHT. The approach is based on a decomposition of the DHT into layers 
of Walsh-Hadamard transforms. In particular, fast algorithms for short block lengths such as $N \in \{4, 8, 12, 24\}$ are
presented. 
1 Introduction


Discrete transforms defined over finite or infinite fields have been playing a relevant role in numerical analysis. A
striking example is the discrete Fourier transform (DFT), which has found applications in several areas, especially in
signal processing. Another relevant example is the discrete Hartley transform (DHT), the discrete version
of the integral transform introduced by Hartley. Besides its numerical appropriateness, the DHT has proven
over the years to be a powerful tool. A decisive factor for applications of the DFT has been the existence of
the so-called fast transforms (FT) for computing it. Fast Hartley transforms also exist and are deeply connected
to the DHT applications. Recent promising applications of discrete transforms concern the use of finite field
Hartley transforms to design digital multiplex systems, efficient multiple access systems and multilevel spread
spectrum sequences.

Fast algorithms with low multiplicative complexity are of particular interest to the community. Very efficient
algorithms such as the prime factor algorithm (PFA) or the Winograd Fourier transform algorithm (WFTA) have also
been used [12, 13]. Another class of algorithms that aims at low multiplicative complexity is the arithmetic
Fourier transform (AFT). The minimal multiplicative complexity, $\mu$, of the one-dimensional DFT for all possible
sequence lengths, $N$, can be computed by converting the DFT into a set of multi-dimensional cyclic convolutions. A
lower bound on the multiplicative complexity of the DFT is given in [15, Theorem 5.4, p. 98]. For some short
blocklengths, the values of $\mu(N)$ are given in Table 1 (some local minima of $\mu$).

Table 1: Minimal multiplicative complexity for computing the DFT of length N

    N        4     8     12     24
    μ(N)     0     2      4     12

The discrete Hartley transform of a signal $v_i$, $i = 0, 1, 2, \ldots, N-1$, is defined by:

\[
V_k = \sum_{i=0}^{N-1} v_i \, \mathrm{cas}\left(\frac{2\pi i k}{N}\right), \qquad k = 0, 1, \ldots, N-1,
\]

where $\mathrm{cas}(x) = \cos(x) + \sin(x)$ is the “cosine and sine” Hartley symmetric kernel.

In this paper, we aim at the introduction of fast algorithms that meet the minimal multiplicative complexity. There is
a simple relationship between the DHT and the DFT spectra of a given real discrete signal
$\mathbf{v} = [v_0 \; v_1 \; \cdots \; v_{N-1}]^T$. Let $\mathbf{U} = [U_0 \; U_1 \; \cdots \; U_{N-1}]^T$ and
$\mathbf{V} = [V_0 \; V_1 \; \cdots \; V_{N-1}]^T$ be the DFT and DHT spectra of $\mathbf{v}$, respectively. Then, we have that

\[
V_k = \Re\{U_k\} - \Im\{U_k\},
\]

\[
U_k = \frac{(V_k + V_{N-k}) - j \cdot (V_k - V_{N-k})}{2},
\]

with $V_N = V_0$.

Therefore, an FFT algorithm for the DHT is also an FFT for the DFT and vice-versa [15, Corollary 6.9]. Besides being
a real transform, the DHT is also an involution, i.e., the kernel of the inverse transform is exactly the same as the one
of the direct transform (self-inverse transform). We exploit the DHT symmetry to derive fast algorithms that attain
the theoretical minimal number of real floating-point multiplications. The idea behind our approach is to carry out the
DHT decomposition based on the classical transforms by Hadamard.
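The DFT/DHT relationship and the involution property can be checked numerically. The following is only a minimal sketch using direct $O(N^2)$ sums; the function names `cas`, `dht`, `dft` and `dft_from_dht` are illustrative choices and do not come from the paper.

```python
import cmath
import math


def cas(x):
    """Hartley kernel cas(x) = cos(x) + sin(x)."""
    return math.cos(x) + math.sin(x)


def dht(v):
    """Direct DHT: V_k = sum_i v_i * cas(2*pi*i*k/N)."""
    n = len(v)
    return [sum(v[i] * cas(2.0 * math.pi * i * k / n) for i in range(n))
            for k in range(n)]


def dft(v):
    """Direct DFT: U_k = sum_i v_i * exp(-j*2*pi*i*k/N)."""
    n = len(v)
    return [sum(v[i] * cmath.exp(-2j * math.pi * i * k / n) for i in range(n))
            for k in range(n)]


def dft_from_dht(V):
    """U_k = [(V_k + V_{N-k}) - j (V_k - V_{N-k})] / 2, indices modulo N."""
    n = len(V)
    return [0.5 * ((V[k] + V[-k % n]) - 1j * (V[k] - V[-k % n]))
            for k in range(n)]


v = [1.0, -2.0, 3.5, 0.25, 4.0, -1.5]
V, U = dht(v), dft(v)
# V_k = Re{U_k} - Im{U_k}
print(max(abs(V[k] - (U[k].real - U[k].imag)) for k in range(len(v))))
# Applying the DHT twice returns N times the input (involution up to 1/N).
print(max(abs(x / len(v) - y) for x, y in zip(dht(V), v)))
```

Both printed values should be at the level of floating-point rounding error.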

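In this work, we adopt the following notation. The input signal is denoted as $\mathbf{v}$. The DHT spectrum of
$\mathbf{v}$ is $\mathbf{V} = [V_0 \; V_1 \; \cdots \; V_{N-1}]^T$. The DHT matrix of size $N$ is referred to as
$\mathbf{H}_N$, whose $(i,k)$th entry is given by
$h_{i,k} = \mathrm{cas}\left(\frac{2\pi (i-1)(k-1)}{N}\right)$, $i, k = 1, 2, \ldots, N$.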


2 Computing the 4-point DHT 


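For $N = 4$, we have the matrix formulation $\mathbf{V} = \mathbf{H}_4 \cdot \mathbf{v}$, which is given by

\[
\begin{bmatrix} V_0 \\ V_1 \\ V_2 \\ V_3 \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & -1 & -1 \\
1 & -1 & 1 & -1 \\
1 & -1 & -1 & 1
\end{bmatrix}
\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix}.
\]

It is therefore equivalent to the 4-point Walsh-Hadamard transform. Thus it has null multiplicative complexity.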
Figure 1: (a) Diagram for the 2-point Walsh-Hadamard transform and (b) diagram for the 4-point DHT based on the
Walsh-Hadamard transform. Small circles at the summation boxes indicate the subtraction operation (invert the sign
of the input) and the “H” blocks denote the Hadamard transform.


Figure 1 shows the signal flow diagram of the 4-point DHT in terms of 2-point Walsh-Hadamard transforms. The
complexity for the 4-DHT is given by 8 additions and zero multiplications.
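The 4-point case can be sketched in a few lines. The arrangement below is one possible two-layer butterfly with 8 additions and no multiplications; it is an assumption that it matches the wiring of Figure 1(b), and the name `dht4` is illustrative.

```python
def dht4(v):
    """4-point DHT from two layers of 2-point Hadamard butterflies."""
    v0, v1, v2, v3 = v
    # Layer 1: 2-point Hadamard transforms on (v0, v2) and (v1, v3).
    a0, a1 = v0 + v2, v0 - v2
    b0, b1 = v1 + v3, v1 - v3
    # Layer 2: combine the partial results (4 more additions, 8 in total).
    return [a0 + b0, a1 + b1, a0 - b0, a1 - b1]


print(dht4([1, 2, 3, 4]))  # [10, -4, -2, 0]
```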


3 Computing the 8-point DHT 


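Let $S_i(0) = v_i$, $i = 0, 1, \ldots, 7$ (input data). The 0-order “pre-additions” are, respectively,
$\{S_0(0) = v_0, S_1(0) = v_1, S_2(0) = v_2, S_3(0) = v_3, S_4(0) = v_4, S_5(0) = v_5, S_6(0) = v_6, S_7(0) = v_7\}$.
Thus, the 8-point DHT can be written as:

\[
\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \\ V_5 \\ V_6 \\ V_7 \\ V_8 \end{bmatrix}
=
\begin{bmatrix}
1 & 1.4142 & 1 & 0 & -1 & -1.4142 & -1 & 0 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & 0 & -1 & 1.4142 & -1 & 0 & 1 & -1.4142 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & -1.4142 & 1 & 0 & -1 & 1.4142 & -1 & 0 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 0 & -1 & -1.4142 & -1 & 0 & 1 & 1.4142 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \\ v_6 \\ v_7 \end{bmatrix},
\]

where $V_8 = V_0$ by the periodicity of the kernel.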

We remark that the column symmetries exploited below follow from the addition of arcs formula:
$\mathrm{cas}(\alpha - \beta) = \cos(\beta) \cdot \mathrm{cas}(\alpha) - \sin(\beta) \cdot \mathrm{cas}'(\alpha)$, where
$\mathrm{cas}'(\cdot)$ is the complementary cas function $\mathrm{cas}'(\alpha) = \cos(\alpha) - \sin(\alpha)$ [3].
We notice that the absolute values of the elements of the 2nd column are identical to those of the corresponding
elements of the 6th column; the same holds for the 3rd and 7th columns. We can thus consider new variables
$(v_1 + v_5)$ and $(v_1 - v_5)$ instead of $v_1$ and $v_5$; $(v_2 + v_6)$ and $(v_2 - v_6)$ instead of $v_2$ and
$v_6$, and so on. Thus, we obtain:


\[
\begin{aligned}
S_0(1) &= (v_0 + v_4), & S_1(1) &= (v_0 - v_4), \\
S_2(1) &= (v_2 + v_6), & S_3(1) &= (v_2 - v_6), \\
S_4(1) &= (v_1 + v_5), & S_5(1) &= (v_1 - v_5), \\
S_6(1) &= (v_3 + v_7), & S_7(1) &= (v_3 - v_7).
\end{aligned}
\]

Figure 2: The 8-point DHT signal flow diagram.


We refer to the above set of equations as the 1st-order pre-additions. The first-order pre-additions introduce several null
elements in the implied new transform matrix. Although such an implementation requires only two multiplications,
we may go further and combine other columns, resulting in alternative 2nd-order pre-additions as follows:


\[
\begin{aligned}
S_0(2) &= (v_0 + v_4), & S_1(2) &= (v_0 - v_4), \\
S_2(2) &= (v_2 + v_6), & S_3(2) &= (v_2 - v_6), \\
S_4(2) &= (v_1 + v_5) + (v_3 + v_7), & S_5(2) &= (v_1 + v_5) - (v_3 + v_7), \\
S_6(2) &= (v_1 - v_5) + (v_3 - v_7), & S_7(2) &= (v_1 - v_5) - (v_3 - v_7).
\end{aligned}
\]


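Thus, we have:

\[
\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \\ V_5 \\ V_6 \\ V_7 \\ V_8 \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 1 & 0 & 0 & .707 & .707 \\
1 & 0 & -1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & .707 & -.707 \\
1 & 0 & 1 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & -.707 & -.707 \\
1 & 0 & -1 & 0 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & -.707 & .707 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} S_0(2) \\ S_1(2) \\ S_2(2) \\ S_3(2) \\ S_4(2) \\ S_5(2) \\ S_6(2) \\ S_7(2) \end{bmatrix}.
\]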


The pre-addition terms can be implemented by Walsh-Hadamard instantiations. A scheme for the implementation
of the 8-point DHT is shown in Figure 2, where only two multiplications by $\sqrt{2}/2 \approx 0.707\ldots$ are required. The
algorithm complexity for computing the 8-point DHT is 22 additions and 2 multiplications.
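The relation between the spectrum and the 2nd-order pre-additions can be transcribed directly into code. The sketch below follows the matrix equation of this section term by term and uses two multiplications by $\sqrt{2}/2$; it does not try to reproduce the addition count of the signal flow diagram, and the name `dht8` is an illustrative choice.

```python
import math

H = math.sqrt(2.0) / 2.0  # 0.707...


def dht8(v):
    """8-point DHT from the 2nd-order pre-additions; returns [V_0, ..., V_7]."""
    v0, v1, v2, v3, v4, v5, v6, v7 = v
    # 2nd-order pre-additions S_0(2), ..., S_7(2).
    s0, s1 = v0 + v4, v0 - v4
    s2, s3 = v2 + v6, v2 - v6
    s4, s5 = (v1 + v5) + (v3 + v7), (v1 + v5) - (v3 + v7)
    s6, s7 = (v1 - v5) + (v3 - v7), (v1 - v5) - (v3 - v7)
    # The only two multiplications.
    m6, m7 = H * s6, H * s7
    return [
        s0 + s2 + s4,        # V_0 (listed as V_8 in the text)
        s1 + s3 + m6 + m7,   # V_1
        s0 - s2 + s5,        # V_2
        s1 - s3 + m6 - m7,   # V_3
        s0 + s2 - s4,        # V_4
        s1 + s3 - m6 - m7,   # V_5
        s0 - s2 - s5,        # V_6
        s1 - s3 - m6 + m7,   # V_7
    ]
```

Comparing `dht8(v)` with a direct evaluation of the defining sum on a few random inputs is a quick way to confirm the signs in the matrix.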


4 Computing the 12-point DHT 


The 0-order pre-additions (data) are defined as $S_i(0) = v_i$, $i = 0, 1, \ldots, N-1$. The Hartley spectrum can be computed
according to $\mathbf{V} = \mathbf{T}(0) \cdot \mathbf{S}(0)$, where $\mathbf{T}(0) = \mathbf{H}_{12}$ and
$\mathbf{S}(0) = [S_0(0) \; S_1(0) \; \cdots \; S_{11}(0)]^T$. Applying the same
reasoning of the previous section, we define:

\[
\begin{aligned}
S_0(1) &= v_0 + v_6, & S_1(1) &= v_0 - v_6, \\
S_2(1) &= v_3 + v_9, & S_3(1) &= v_3 - v_9, \\
S_4(1) &= v_1 + v_7, & S_5(1) &= v_1 - v_7, \\
S_6(1) &= v_2 + v_8, & S_7(1) &= v_2 - v_8, \\
S_8(1) &= v_4 + v_{10}, & S_9(1) &= v_4 - v_{10}, \\
S_{10}(1) &= v_5 + v_{11}, & S_{11}(1) &= v_5 - v_{11}.
\end{aligned}
\]


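The resulting transform is:

\[
\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \\ V_5 \\ V_6 \\ V_7 \\ V_8 \\ V_9 \\ V_{10} \\ V_{11} \\ V_{12} \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 1 & 0 & 1.366 & 0 & 1.366 & 0 & .366 & 0 & -.366 \\
1 & 0 & -1 & 0 & 1.366 & 0 & .366 & 0 & -1.366 & 0 & -.366 & 0 \\
0 & 1 & 0 & -1 & 0 & 1 & 0 & -1 & 0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & .366 & 0 & -1.366 & 0 & .366 & 0 & -1.366 & 0 \\
0 & 1 & 0 & 1 & 0 & -.366 & 0 & -.366 & 0 & -1.366 & 0 & 1.366 \\
1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 & 1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1 & 0 & -1.366 & 0 & 1.366 & 0 & .366 & 0 & .366 \\
1 & 0 & 1 & 0 & -1.366 & 0 & .366 & 0 & -1.366 & 0 & .366 & 0 \\
0 & 1 & 0 & 1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 & -1 \\
1 & 0 & -1 & 0 & -.366 & 0 & -1.366 & 0 & .366 & 0 & 1.366 & 0 \\
0 & 1 & 0 & -1 & 0 & .366 & 0 & -.366 & 0 & -1.366 & 0 & -1.366 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} v_0 + v_6 \\ v_0 - v_6 \\ v_3 + v_9 \\ v_3 - v_9 \\ v_1 + v_7 \\ v_1 - v_7 \\ v_2 + v_8 \\ v_2 - v_8 \\ v_4 + v_{10} \\ v_4 - v_{10} \\ v_5 + v_{11} \\ v_5 - v_{11} \end{bmatrix},
\]

where $V_{12} = V_0$. The above matrix is denoted as $\mathbf{T}(1)$. Therefore, this equation can be written as
$\mathbf{V} = \mathbf{T}(1) \cdot \mathbf{S}(1)$, where $\mathbf{S}(1) = [S_0(1) \; S_1(1) \; \cdots \; S_{11}(1)]^T$.

Observing the remaining symmetries, we also define the 2nd-order pre-additions (layer #2):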


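\[
\begin{aligned}
S_0(2) &= v_0 + v_6, \quad S_1(2) = v_0 - v_6, \quad S_2(2) = v_3 + v_9, \quad S_3(2) = v_3 - v_9, \\
S_4(2) &= (v_1 + v_7) + (v_4 + v_{10}), & S_5(2) &= (v_1 + v_7) - (v_4 + v_{10}), \\
S_6(2) &= (v_1 - v_7) + (v_2 - v_8), & S_7(2) &= (v_1 - v_7) - (v_2 - v_8), \\
S_8(2) &= (v_2 + v_8) + (v_5 + v_{11}), & S_9(2) &= (v_2 + v_8) - (v_5 + v_{11}), \\
S_{10}(2) &= (v_4 - v_{10}) + (v_5 - v_{11}), & S_{11}(2) &= (v_4 - v_{10}) - (v_5 - v_{11}).
\end{aligned}
\]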

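We have then:

\[
\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \\ V_5 \\ V_6 \\ V_7 \\ V_8 \\ V_9 \\ V_{10} \\ V_{11} \\ V_{12} \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 1.366 & 0 & 0 & 0 & 0 & .366 \\
1 & 0 & -1 & 0 & 0 & 1.366 & 0 & 0 & 0 & .366 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & .366 & 0 & 0 & 0 & -1.366 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & -.366 & 0 & 0 & 0 & 0 & -1.366 \\
1 & 0 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & -1.366 & 0 & 0 & .366 & 0 \\
1 & 0 & 1 & 0 & -1.366 & 0 & 0 & 0 & .366 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & -1 & 0 & 0 & -.366 & 0 & 0 & 0 & -1.366 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & .366 & 0 & 0 & -1.366 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} S_0(2) \\ S_1(2) \\ S_2(2) \\ S_3(2) \\ S_4(2) \\ S_5(2) \\ S_6(2) \\ S_7(2) \\ S_8(2) \\ S_9(2) \\ S_{10}(2) \\ S_{11}(2) \end{bmatrix}.
\]

The spectrum can be computed in terms of the 2nd layer pre-additions as $\mathbf{V} = \mathbf{T}(2) \cdot \mathbf{S}(2)$,
where $\mathbf{T}(2)$ is the $12 \times 12$ matrix above and
$\mathbf{S}(2) = [S_0(2) \; S_1(2) \; \cdots \; S_{11}(2)]^T$.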

There is no pair of non-combined identical columns left (signs of elements not considered). However, the integer
part of the elements greater than unity in the $\mathbf{T}(2)$ matrix can be handled separately. The spectral component
substitutions that account for this special addition, used to balance the matrix, are shown below:

\[
\begin{aligned}
V_1 &\to [(v_1 - v_7) + (v_2 - v_8)] = S_6(2), \\
V_2 &\to [(v_1 + v_7) - (v_4 + v_{10})] = S_5(2), \\
V_3 &\to 0, \\
V_4 &\to -[(v_2 + v_8) + (v_5 + v_{11})] = -S_8(2), \\
V_5 &\to -[(v_4 - v_{10}) - (v_5 - v_{11})] = -S_{11}(2), \\
V_6 &\to 0, \\
V_7 &\to -[(v_1 - v_7) - (v_2 - v_8)] = -S_7(2), \\
V_8 &\to -[(v_1 + v_7) + (v_4 + v_{10})] = -S_4(2), \\
V_9 &\to 0, \\
V_{10} &\to -[(v_2 + v_8) - (v_5 + v_{11})] = -S_9(2), \\
V_{11} &\to -[(v_4 - v_{10}) + (v_5 - v_{11})] = -S_{10}(2), \\
V_{12} &\to 0.
\end{aligned}
\]


The procedure of combining pairs of columns can be iterated, yielding the following new pre-addition set (3rd-order
pre-additions, layer #3):

\[
\begin{aligned}
S_0(3) &= v_0 + v_6, \quad S_1(3) = v_0 - v_6, \quad S_2(3) = v_3 + v_9, \quad S_3(3) = v_3 - v_9, \\
S_4(3) &= [(v_1 + v_7) + (v_4 + v_{10})] + [(v_2 + v_8) + (v_5 + v_{11})], \\
S_5(3) &= [(v_1 + v_7) + (v_4 + v_{10})] - [(v_2 + v_8) + (v_5 + v_{11})], \\
S_6(3) &= [(v_1 + v_7) - (v_4 + v_{10})] + [(v_2 + v_8) - (v_5 + v_{11})], \\
S_7(3) &= [(v_1 + v_7) - (v_4 + v_{10})] - [(v_2 + v_8) - (v_5 + v_{11})], \\
S_8(3) &= [(v_1 - v_7) - (v_2 - v_8)] + [(v_4 - v_{10}) + (v_5 - v_{11})], \\
S_9(3) &= [(v_1 - v_7) - (v_2 - v_8)] - [(v_4 - v_{10}) + (v_5 - v_{11})], \\
S_{10}(3) &= [(v_1 - v_7) + (v_2 - v_8)] + [(v_4 - v_{10}) - (v_5 - v_{11})], \\
S_{11}(3) &= [(v_1 - v_7) + (v_2 - v_8)] - [(v_4 - v_{10}) - (v_5 - v_{11})].
\end{aligned}
\]

Figure 3: The 12-point DHT fast algorithm diagram.


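The final relationship between the Hartley spectrum and the pre-additions can be established:

\[
\begin{bmatrix} V_1 \\ V_2 \\ V_3 \\ V_4 \\ V_5 \\ V_6 \\ V_7 \\ V_8 \\ V_9 \\ V_{10} \\ V_{11} \\ V_{12} \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & .366 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & .366 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & .366 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & -.366 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & -.366 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & -.366 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 \\
1 & 0 & -1 & 0 & 0 & 0 & -.366 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & .366 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} S_0(3) \\ S_1(3) \\ S_2(3) \\ S_3(3) \\ S_4(3) \\ S_5(3) \\ S_6(3) \\ S_7(3) \\ S_8(3) \\ S_9(3) \\ S_{10}(3) \\ S_{11}(3) \end{bmatrix}
+
\begin{bmatrix} S_6(2) \\ S_5(2) \\ 0 \\ -S_8(2) \\ -S_{11}(2) \\ 0 \\ -S_7(2) \\ -S_4(2) \\ 0 \\ -S_9(2) \\ -S_{10}(2) \\ 0 \end{bmatrix}.
\]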


The only four real floating-point multiplications required are
$\frac{\sqrt{3}-1}{2} \times [S_5(3), S_6(3), S_9(3), S_{10}(3)]$. Notice that $\frac{\sqrt{3}-1}{2} \approx 0.366\ldots$
The corresponding block diagram is sketched in Figure 3. The complexity of the suggested implementation
is given by 52 additions and 4 multiplications.
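As a cross-check of the three pre-addition layers, the final relationship can be transcribed into code. This is a sketch under the assumption that the sign pattern is exactly the one in the matrix equation above; the name `dht12` and the local variable names are illustrative. Counting the `+` and `-` operations in this transcription gives 52 additions, and the constant $(\sqrt{3}-1)/2$ is applied four times.

```python
import math

G = (math.sqrt(3.0) - 1.0) / 2.0  # 0.366...


def dht12(v):
    """12-point DHT via 3 layers of pre-additions; returns [V_0, ..., V_11]."""
    # Layer 1: sums and differences of samples half a block apart.
    s = [v[i] + v[i + 6] for i in range(6)]
    d = [v[i] - v[i + 6] for i in range(6)]
    # Layer 2: S_4(2), ..., S_11(2) (S_0..S_3 are s[0], d[0], s[3], d[3]).
    b4, b5 = s[1] + s[4], s[1] - s[4]
    b6, b7 = d[1] + d[2], d[1] - d[2]
    b8, b9 = s[2] + s[5], s[2] - s[5]
    b10, b11 = d[4] + d[5], d[4] - d[5]
    # Layer 3: S_4(3), ..., S_11(3).
    c4, c5 = b4 + b8, b4 - b8
    c6, c7 = b5 + b9, b5 - b9
    c8, c9 = b7 + b10, b7 - b10
    c10, c11 = b6 + b11, b6 - b11
    # The four multiplications.
    m5, m6, m9, m10 = G * c5, G * c6, G * c9, G * c10
    # Shared combinations of S_0..S_3.
    e0, e1 = s[0] + s[3], s[0] - s[3]
    o0, o1 = d[0] + d[3], d[0] - d[3]
    return [
        e0 + c4,          # V_0 (listed as V_12 in the text)
        o0 + m10 + b6,    # V_1
        e1 + m6 + b5,     # V_2
        o1 + c8,          # V_3
        e0 + m5 - b8,     # V_4
        o0 - m10 - b11,   # V_5
        e1 - c7,          # V_6
        o1 - m9 - b7,     # V_7
        e0 - m5 - b4,     # V_8
        o0 - c11,         # V_9
        e1 - m6 - b9,     # V_10
        o1 + m9 - b10,    # V_11
    ]
```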
5 Computing the 24-point DHT


Following the same steps as before, the 0-order pre-additions are defined as $S_i(0) = v_i$, $i = 0, 1, \ldots, 23$. We have the
expression below:

\[
[V_0 \; V_1 \; \cdots \; V_{23}]^T = \mathbf{H}_{24} \cdot [v_0 \; v_1 \; \cdots \; v_{23}]^T,
\]

where the entry of $\mathbf{H}_{24}$ in row $k$ and column $i$ ($i, k = 0, 1, \ldots, 23$) is $\mathrm{cas}(2\pi i k / 24)$.
The entry in column $i$ of row $k$ equals the entry in column $(ik \bmod 24)$ of the row $k = 1$, so every row is a
resampling of that row. To three decimal places, the row $k = 1$ reads:

    1  1.224  1.366  1.414  1.366  1.224  1  .707  .366  0  -.366  -.707
    -1  -1.224  -1.366  -1.414  -1.366  -1.224  -1  -.707  -.366  0  .366  .707


Going further, the 1st-order pre-additions (layer #1) are:

\[
\begin{aligned}
S_0(1) &= v_0 + v_{12}, & S_1(1) &= v_0 - v_{12}, & S_2(1) &= v_1 + v_{13}, & S_3(1) &= v_1 - v_{13}, \\
S_4(1) &= v_2 + v_{14}, & S_5(1) &= v_2 - v_{14}, & S_6(1) &= v_3 + v_{15}, & S_7(1) &= v_3 - v_{15}, \\
S_8(1) &= v_4 + v_{16}, & S_9(1) &= v_4 - v_{16}, & S_{10}(1) &= v_5 + v_{17}, & S_{11}(1) &= v_5 - v_{17}, \\
S_{12}(1) &= v_6 + v_{18}, & S_{13}(1) &= v_6 - v_{18}, & S_{14}(1) &= v_7 + v_{19}, & S_{15}(1) &= v_7 - v_{19}, \\
S_{16}(1) &= v_8 + v_{20}, & S_{17}(1) &= v_8 - v_{20}, & S_{18}(1) &= v_9 + v_{21}, & S_{19}(1) &= v_9 - v_{21}, \\
S_{20}(1) &= v_{10} + v_{22}, & S_{21}(1) &= v_{10} - v_{22}, & S_{22}(1) &= v_{11} + v_{23}, & S_{23}(1) &= v_{11} - v_{23}.
\end{aligned}
\]
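The effect of the first layer can be illustrated for any even blocklength. In the sketch below (function names are illustrative, and sums and differences are kept in two separate lists rather than interleaved as $S_{2i}(1)$ and $S_{2i+1}(1)$), even-indexed spectral components depend only on the sums $v_i + v_{i+N/2}$ and odd-indexed ones only on the differences $v_i - v_{i+N/2}$, which is what halves the number of columns to be processed.

```python
import math


def cas(x):
    """Hartley kernel cas(x) = cos(x) + sin(x)."""
    return math.cos(x) + math.sin(x)


def first_order_preadditions(v):
    """Sums and differences of samples that are N/2 positions apart."""
    half = len(v) // 2
    sums = [v[i] + v[i + half] for i in range(half)]
    diffs = [v[i] - v[i + half] for i in range(half)]
    return sums, diffs


def dht_from_first_layer(v):
    """DHT computed from the first layer only (N must be even)."""
    n = len(v)
    sums, diffs = first_order_preadditions(v)
    out = []
    for k in range(n):
        # cas(2*pi*(i + N/2)*k/N) = (-1)^k * cas(2*pi*i*k/N)
        src = sums if k % 2 == 0 else diffs
        out.append(sum(src[i] * cas(2.0 * math.pi * i * k / n)
                       for i in range(n // 2)))
    return out
```

The remaining layers of this section apply the same idea again to columns whose entries agree up to sign.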



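A new set of pre-additions can be considered. Let the 2nd-order pre-additions be:

\[
\begin{aligned}
S_0(2) &= S_0(1), \quad S_1(2) = S_1(1), \quad S_2(2) = S_{12}(1), \quad S_3(2) = S_{13}(1), \\
S_4(2) &= S_2(1) + S_{14}(1), & S_5(2) &= S_2(1) - S_{14}(1), \\
S_6(2) &= S_3(1) + S_{11}(1), & S_7(2) &= S_3(1) - S_{11}(1), \\
S_8(2) &= S_4(1) + S_{16}(1), & S_9(2) &= S_4(1) - S_{16}(1), \\
S_{10}(2) &= S_5(1) + S_9(1), & S_{11}(2) &= S_5(1) - S_9(1), \\
S_{12}(2) &= S_8(1) + S_{20}(1), & S_{13}(2) &= S_8(1) - S_{20}(1), \\
S_{14}(2) &= S_{10}(1) + S_{22}(1), & S_{15}(2) &= S_{10}(1) - S_{22}(1), \\
S_{16}(2) &= S_{15}(1) + S_{23}(1), & S_{17}(2) &= S_{15}(1) - S_{23}(1), \\
S_{18}(2) &= S_{17}(1) + S_{21}(1), & S_{19}(2) &= S_{17}(1) - S_{21}(1), \\
S_{20}(2) &= S_6(1) + S_{18}(1), & S_{21}(2) &= S_6(1) - S_{18}(1), \\
S_{22}(2) &= S_7(1) + S_{19}(1), & S_{23}(2) &= S_7(1) - S_{19}(1).
\end{aligned}
\]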

Again, we have a few cases where the pairs do not match perfectly. Applying the same strategy adopted in the
12-blocklength case, we set apart some matrix components in order to “balance” the matrix. The 3rd-order pre-additions
follow:

\[
\begin{aligned}
S_0(3) &= S_0(2), \quad S_1(3) = S_1(2), \quad S_2(3) = S_2(2), \quad S_3(3) = S_3(2), \\
S_4(3) &= S_{20}(2), & S_5(3) &= S_{21}(2), \\
S_6(3) &= S_4(2) + S_{12}(2), & S_7(3) &= S_4(2) - S_{12}(2), \\
S_8(3) &= S_5(2) + S_9(2), & S_9(3) &= S_5(2) - S_9(2), \\
S_{10}(3) &= S_8(2) + S_{14}(2), & S_{11}(3) &= S_8(2) - S_{14}(2), \\
S_{12}(3) &= S_{13}(2) + S_{15}(2), & S_{13}(3) &= S_{13}(2) - S_{15}(2), \\
S_{14}(3) &= S_{22}(2) + S_{23}(2), & S_{15}(3) &= S_{22}(2) - S_{23}(2), \\
S_{16}(3) &= S_{10}(2) + S_{19}(2), & S_{17}(3) &= S_{10}(2) - S_{19}(2), \\
S_{18}(3) &= S_{11}(2) + S_{18}(2), & S_{19}(3) &= S_{11}(2) - S_{18}(2), \\
S_{20}(3) &= S_6(2), \quad S_{21}(3) = S_7(2), \quad S_{22}(3) = S_{16}(2), \quad S_{23}(3) = S_{17}(2).
\end{aligned}
\]
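The special addition vector required in this step is written as follows, where the relation between the spectrum, the
3rd-order pre-additions and the special addition vector is given row by row (null matrix entries are omitted and the
entry of the special addition vector, when nonzero, is the last term, in square brackets):

\[
\begin{aligned}
V_0 &= S_0(3) + S_2(3) + S_4(3) + S_6(3) + S_{10}(3) \\
V_1 &= S_1(3) + S_3(3) + 0.707\,S_{14}(3) + 0.366\,S_{16}(3) + 1.224\,S_{20}(3) + 0.707\,S_{23}(3) + [S_{10}(2)] \\
V_2 &= S_0(3) - S_2(3) + S_5(3) + 1.366\,S_8(3) + 0.366\,S_{13}(3) \\
V_3 &= S_1(3) - S_3(3) + 0.707\,S_{15}(3) + S_{18}(3) + 1.414\,S_{21}(3) \\
V_4 &= S_0(3) + S_2(3) - S_4(3) + 1.366\,S_7(3) + 0.366\,S_{11}(3) \\
V_5 &= S_1(3) + S_3(3) - 0.707\,S_{14}(3) - 0.366\,S_{16}(3) + 1.224\,S_{20}(3) - 0.707\,S_{23}(3) - [S_{19}(2)] \\
V_6 &= S_0(3) - S_2(3) - S_5(3) + S_9(3) + S_{12}(3) \\
V_7 &= S_1(3) - S_3(3) - 0.707\,S_{15}(3) - 0.366\,S_{19}(3) + 0.707\,S_{21}(3) + 1.224\,S_{22}(3) - [S_{11}(2)] \\
V_8 &= S_0(3) + S_2(3) + S_4(3) + 0.366\,S_6(3) - 1.366\,S_{10}(3) \\
V_9 &= S_1(3) + S_3(3) + 0.707\,S_{14}(3) - S_{17}(3) - 1.414\,S_{23}(3) \\
V_{10} &= S_0(3) - S_2(3) + S_5(3) - 0.366\,S_8(3) - 1.366\,S_{13}(3) \\
V_{11} &= S_1(3) - S_3(3) + 0.707\,S_{15}(3) + 0.366\,S_{19}(3) - 0.707\,S_{21}(3) + 1.224\,S_{22}(3) - [S_{18}(2)] \\
V_{12} &= S_0(3) + S_2(3) - S_4(3) - S_7(3) + S_{11}(3) \\
V_{13} &= S_1(3) + S_3(3) - 0.707\,S_{14}(3) + 0.366\,S_{16}(3) - 1.224\,S_{20}(3) - 0.707\,S_{23}(3) + [S_{10}(2)] \\
V_{14} &= S_0(3) - S_2(3) - S_5(3) - 1.366\,S_9(3) + 0.366\,S_{12}(3) \\
V_{15} &= S_1(3) - S_3(3) - 0.707\,S_{15}(3) + S_{18}(3) - 1.414\,S_{21}(3) \\
V_{16} &= S_0(3) + S_2(3) + S_4(3) - 1.366\,S_6(3) + 0.366\,S_{10}(3) \\
V_{17} &= S_1(3) + S_3(3) + 0.707\,S_{14}(3) - 0.366\,S_{16}(3) - 1.224\,S_{20}(3) + 0.707\,S_{23}(3) - [S_{19}(2)] \\
V_{18} &= S_0(3) - S_2(3) + S_5(3) - S_8(3) + S_{13}(3) \\
V_{19} &= S_1(3) - S_3(3) + 0.707\,S_{15}(3) - 0.366\,S_{19}(3) - 0.707\,S_{21}(3) - 1.224\,S_{22}(3) - [S_{11}(2)] \\
V_{20} &= S_0(3) + S_2(3) - S_4(3) - 0.366\,S_7(3) - 1.366\,S_{11}(3) \\
V_{21} &= S_1(3) + S_3(3) - 0.707\,S_{14}(3) - S_{17}(3) + 1.414\,S_{23}(3) \\
V_{22} &= S_0(3) - S_2(3) - S_5(3) + 0.366\,S_9(3) - 1.366\,S_{12}(3) \\
V_{23} &= S_1(3) - S_3(3) - 0.707\,S_{15}(3) + 0.366\,S_{19}(3) + 0.707\,S_{21}(3) - 1.224\,S_{22}(3) - [S_{18}(2)]
\end{aligned}
\]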


The procedure of combining matched columns must be applied once more. Making the following definitions, we
get the final 4th-order pre-additions, remarking that, as in the previous iteration, another special addition vector must
be separated, yielding:

\[
\begin{aligned}
S_0(4) &= S_0(3), \quad S_1(4) = S_1(3), \quad S_2(4) = S_2(3), \quad S_3(4) = S_3(3), \\
S_4(4) &= S_4(3), \quad S_5(4) = S_5(3), \quad S_6(4) = S_{17}(3), \quad S_7(4) = S_{18}(3), \\
S_8(4) &= S_6(3) + S_{10}(3), & S_9(4) &= S_8(3) + S_{13}(3), \\
S_{10}(4) &= S_8(3) - S_{13}(3), & S_{11}(4) &= S_6(3) - S_{10}(3), \\
S_{12}(4) &= S_9(3) + S_{12}(3), & S_{13}(4) &= S_7(3) + S_{11}(3), \\
S_{14}(4) &= S_7(3) - S_{11}(3), & S_{15}(4) &= S_9(3) - S_{12}(3), \\
S_{16}(4) &= S_{14}(3) + S_{23}(3), & S_{17}(4) &= S_{14}(3) - S_{23}(3), \\
S_{18}(4) &= S_{15}(3) + S_{21}(3), & S_{19}(4) &= S_{15}(3) - S_{21}(3), \\
S_{20}(4) &= S_{16}(3), \quad S_{21}(4) = S_{19}(3), \quad S_{22}(4) = S_{20}(3), \quad S_{23}(4) = S_{22}(3).
\end{aligned}
\]
Figure 4: The 24-point DHT fast algorithm diagram (stages: 1st-order, 2nd-order, 3rd-order and 4th-order pre-additions).


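Deriving the DHT in terms of the fourth pre-addition layer, we obtain (row by row, with null matrix entries omitted;
the terms in square brackets are the entries of the special addition vectors of this iteration and of the previous one):

\[
\begin{aligned}
V_0 &= S_0(4) + S_2(4) + S_4(4) + S_8(4) \\
V_1 &= S_1(4) + S_3(4) + 0.707\,S_{16}(4) + 0.366\,S_{20}(4) + 1.224\,S_{22}(4) + [S_{10}(2)] \\
V_2 &= S_0(4) - S_2(4) + S_5(4) + 0.366\,S_9(4) + [S_8(3)] \\
V_3 &= S_1(4) - S_3(4) + S_7(4) + 0.707\,S_{18}(4) + [0.707\,S_{21}(3)] \\
V_4 &= S_0(4) + S_2(4) - S_4(4) + 0.366\,S_{13}(4) + [S_7(3)] \\
V_5 &= S_1(4) + S_3(4) - 0.707\,S_{16}(4) - 0.366\,S_{20}(4) + 1.224\,S_{22}(4) - [S_{19}(2)] \\
V_6 &= S_0(4) - S_2(4) - S_5(4) + S_{12}(4) \\
V_7 &= S_1(4) - S_3(4) - 0.707\,S_{19}(4) - 0.366\,S_{21}(4) + 1.224\,S_{23}(4) - [S_{11}(2)] \\
V_8 &= S_0(4) + S_2(4) + S_4(4) + 0.366\,S_{11}(4) - [S_{10}(3)] \\
V_9 &= S_1(4) + S_3(4) - S_6(4) + 0.707\,S_{17}(4) - [0.707\,S_{23}(3)] \\
V_{10} &= S_0(4) - S_2(4) + S_5(4) - 0.366\,S_9(4) - [S_{13}(3)] \\
V_{11} &= S_1(4) - S_3(4) + 0.707\,S_{19}(4) + 0.366\,S_{21}(4) + 1.224\,S_{23}(4) - [S_{18}(2)] \\
V_{12} &= S_0(4) + S_2(4) - S_4(4) - S_{14}(4) \\
V_{13} &= S_1(4) + S_3(4) - 0.707\,S_{16}(4) + 0.366\,S_{20}(4) - 1.224\,S_{22}(4) + [S_{10}(2)] \\
V_{14} &= S_0(4) - S_2(4) - S_5(4) - 0.366\,S_{15}(4) - [S_9(3)] \\
V_{15} &= S_1(4) - S_3(4) + S_7(4) - 0.707\,S_{18}(4) - [0.707\,S_{21}(3)] \\
V_{16} &= S_0(4) + S_2(4) + S_4(4) - 0.366\,S_{11}(4) - [S_6(3)] \\
V_{17} &= S_1(4) + S_3(4) + 0.707\,S_{16}(4) - 0.366\,S_{20}(4) - 1.224\,S_{22}(4) - [S_{19}(2)] \\
V_{18} &= S_0(4) - S_2(4) + S_5(4) - S_{10}(4) \\
V_{19} &= S_1(4) - S_3(4) + 0.707\,S_{19}(4) - 0.366\,S_{21}(4) - 1.224\,S_{23}(4) - [S_{11}(2)] \\
V_{20} &= S_0(4) + S_2(4) - S_4(4) - 0.366\,S_{13}(4) - [S_{11}(3)] \\
V_{21} &= S_1(4) + S_3(4) - S_6(4) - 0.707\,S_{17}(4) + [0.707\,S_{23}(3)] \\
V_{22} &= S_0(4) - S_2(4) - S_5(4) + 0.366\,S_{15}(4) - [S_{12}(3)] \\
V_{23} &= S_1(4) - S_3(4) - 0.707\,S_{19}(4) + 0.366\,S_{21}(4) - 1.224\,S_{23}(4) - [S_{18}(2)]
\end{aligned}
\]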


Because only twelve floating-point multiplications are required, the theoretical lower bound on the number of multiplications
is achieved. The corresponding block diagram is depicted in Figure 4. The complexity of the scheme is given by
138 additions and 12 multiplications.
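The four pre-addition layers and the final relationship can be checked against the defining sum with a short script. The sketch below is a plain transcription of the equations of this section; the names `dht24` and `butterflies` are illustrative, the constants are taken as $\sqrt{2}/2$, $(\sqrt{3}-1)/2$ and $\sqrt{6}/2$ (which round to the 0.707, 0.366 and 1.224 used in the text), and no attempt is made to share products or to reproduce the operation counts quoted above.

```python
import math

H = math.sqrt(2.0) / 2.0          # 0.707...
G = (math.sqrt(3.0) - 1.0) / 2.0  # 0.366...
A = math.sqrt(6.0) / 2.0          # 1.224...


def butterflies(x, pairs):
    """Concatenate [x[p] + x[q], x[p] - x[q]] over the given index pairs."""
    out = []
    for p, q in pairs:
        out += [x[p] + x[q], x[p] - x[q]]
    return out


def dht24(v):
    """24-point DHT via 4 layers of pre-additions; returns [V_0, ..., V_23]."""
    s1 = butterflies(v, [(i, i + 12) for i in range(12)])        # layer 1
    w = [s1[0], s1[1], s1[12], s1[13]] + butterflies(            # layer 2
        s1, [(2, 14), (3, 11), (4, 16), (5, 9), (8, 20),
             (10, 22), (15, 23), (17, 21), (6, 18), (7, 19)])
    t = ([w[0], w[1], w[2], w[3], w[20], w[21]]                  # layer 3
         + butterflies(w, [(4, 12), (5, 9), (8, 14), (13, 15),
                           (22, 23), (10, 19), (11, 18)])
         + [w[6], w[7], w[16], w[17]])
    u = t[:6] + [t[17], t[18],                                   # layer 4
                 t[6] + t[10], t[8] + t[13], t[8] - t[13], t[6] - t[10],
                 t[9] + t[12], t[7] + t[11], t[7] - t[11], t[9] - t[12],
                 t[14] + t[23], t[14] - t[23], t[15] + t[21], t[15] - t[21],
                 t[16], t[19], t[20], t[22]]
    e0, e1 = u[0] + u[2], u[0] - u[2]
    o0, o1 = u[1] + u[3], u[1] - u[3]
    return [
        e0 + u[4] + u[8],
        o0 + H * u[16] + G * u[20] + A * u[22] + w[10],
        e1 + u[5] + G * u[9] + t[8],
        o1 + u[7] + H * u[18] + H * t[21],
        e0 - u[4] + G * u[13] + t[7],
        o0 - H * u[16] - G * u[20] + A * u[22] - w[19],
        e1 - u[5] + u[12],
        o1 - H * u[19] - G * u[21] + A * u[23] - w[11],
        e0 + u[4] + G * u[11] - t[10],
        o0 - u[6] + H * u[17] - H * t[23],
        e1 + u[5] - G * u[9] - t[13],
        o1 + H * u[19] + G * u[21] + A * u[23] - w[18],
        e0 - u[4] - u[14],
        o0 - H * u[16] + G * u[20] - A * u[22] + w[10],
        e1 - u[5] - G * u[15] - t[9],
        o1 + u[7] - H * u[18] - H * t[21],
        e0 + u[4] - G * u[11] - t[6],
        o0 + H * u[16] - G * u[20] - A * u[22] - w[19],
        e1 + u[5] - u[10],
        o1 + H * u[19] - G * u[21] - A * u[23] - w[11],
        e0 - u[4] - G * u[13] - t[11],
        o0 - u[6] - H * u[17] + H * t[23],
        e1 - u[5] + G * u[15] - t[12],
        o1 - H * u[19] + G * u[21] - A * u[23] - w[18],
    ]
```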
6 Conclusions


Fast algorithms for the DHT capable of achieving the lower bound on the multiplicative complexity of the DFT/DHT
are proposed. In particular, algorithms for short block lengths are presented. They are based on a multilayer decomposition
of the DHT using Walsh-Hadamard transforms. Each Walsh-Hadamard transformation implements pre-additions.
These schemes are attractive and easy to implement in low-cost high-speed dedicated integrated circuits or digital
signal processors.